Finding type 2 diabetes causal single nucleotide polymorphism combinations and functional modules from genome-wide association data
Abstract
BACKGROUND Because individual markers in a genome-wide association study (GWAS) have low statistical power, detecting causal single nucleotide polymorphisms (SNPs) for complex diseases is challenging. SNP combinations have been suggested to compensate for this low power, but evaluating SNP combinations from GWAS data is computationally expensive.
METHODS We aim to detect type 2 diabetes (T2D) causal SNP combinations from a GWAS dataset with optimal filtration and to discover the biological meaning of the detected SNP combinations. Optimal filtration can enhance the statistical power of SNP combinations by comparing the error rates of SNP combinations obtained under various Bonferroni thresholds and p-value range-based thresholds, each combined with linkage disequilibrium (LD) pruning. T2D causal SNP combinations are selected from the optimal SNP dataset using random forests with variable selection. T2D causal SNP combinations and genome-wide SNPs are mapped into functional modules using expanded gene set enrichment analysis (GSEA), which considers pathway, transcription factor (TF)-target, miRNA-target, gene ontology, and protein complex functional modules. Prediction error rates are measured for SNP sets from functional module-based filtration, which selects SNPs within functional modules from genome-wide SNPs based on expanded GSEA.
RESULTS A T2D causal SNP combination containing 101 SNPs from the Wellcome Trust Case Control Consortium (WTCCC) GWAS dataset is selected using optimal filtration criteria, with an error rate of 10.25%. Matching the 101 SNPs with known T2D genes and functional modules reveals the relationships between T2D and SNP combinations. The prediction error rates of SNP sets from functional module-based filtration show no significant difference from those of randomly selected SNP sets or of T2D causal SNP combinations from optimal filtration.
CONCLUSIONS We propose a detection method for complex disease causal SNP combinations from an optimal SNP dataset by using random forests with variable selection. Mapping the biological meanings of detected SNP combinations can help uncover complex disease mechanisms.
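The filtration step described in METHODS (a Bonferroni p-value threshold combined with LD pruning) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the significance level `alpha`, the `r2_max` cutoff, and the greedy pruning order are all assumptions made for illustration, and the abstract does not state the thresholds actually used. The random-forest step that compares error rates across candidate thresholds is not shown.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-SNP p-value cutoff after Bonferroni correction."""
    return alpha / n_tests


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 codes),
    used here as a simple stand-in for pairwise LD."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def filter_snps(pvalues, genotypes, alpha=0.05, r2_max=0.8):
    """Keep SNPs that pass the Bonferroni cutoff, then greedily prune LD.

    pvalues:   dict mapping SNP id -> association p-value
    genotypes: dict mapping SNP id -> list of genotype codes per sample
    alpha, r2_max: illustrative defaults (assumptions, not from the paper)
    """
    cutoff = bonferroni_threshold(alpha, len(pvalues))
    # Visit the most significant SNPs first so they are the ones retained.
    passing = sorted(
        (snp for snp, p in pvalues.items() if p <= cutoff),
        key=lambda snp: pvalues[snp],
    )
    kept = []
    for snp in passing:
        # Drop a SNP if it is in high LD with any SNP already kept.
        if all(r_squared(genotypes[snp], genotypes[k]) < r2_max for k in kept):
            kept.append(snp)
    return kept
```

In the workflow the abstract describes, a filter like this would be run under several threshold settings, and the setting whose SNP set yields the lowest classification error would be taken as the optimal dataset.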
Similar resources
Single-nucleotide polymorphism of rs11061971 (+219 A>T) in adiponectin receptor 2 (AdipoR2) gene and its association with risk of type 2 diabetes among an Iranian population
Background and Objectives: Genetic modifications in the adiponectin receptor 2 (AdipoR2) gene can affect phenotypes associated with insulin resistance and diabetes. The purpose of this study was to evaluate the possible role of genetic modifications in the AdipoR2 gene, to determine the frequency of genotypes and polymorphism alleles of this gene at rs11061971 (+219 A>T), and to investigate its...
Mouse Models of Human GWAS Hits for Obesity and Diabetes in the Post Genomic Era: Time for Reevaluation
In recent years, genome-wide association studies (GWAS) have identified hundreds of loci and thousands of single-nucleotide polymorphisms (SNPs) associated with type 2 diabetes mellitus (T2DM) and obesity traits [such as body mass index (BMI) and waist–hip ratio (WHR)] in the human population (1–4). The vast majority of these SNPs are in non-coding regions of the genome and distal to promoters,...
Functional annotation of sixty-five type-2 diabetes risk SNPs and its application in risk prediction
Genome-wide association studies (GWAS) have identified more than sixty single nucleotide polymorphisms (SNPs) associated with increased risk for type 2 diabetes (T2D). However, the identification of causal risk SNPs for T2D pathogenesis was complicated by the fact that each risk SNP is a surrogate for hundreds of SNPs, most of which reside in non-coding regions. Here we provide a comprehe...
Lack of Association of Mitochondrial A3243G tRNALeu Mutation in Iranian Patients with Type 2 Diabetes
Many kinds of mutations in mitochondrial (mt) DNA have been reported to be related to the development of Diabetes Mellitus (DM); this type of diabetes has also been shown to be influenced by other genetic and/or environmental factors. Among them, tRNALeu(UUR) and its adjacent mtDNA NADH dehydrogenase subunit 1 (ND1) region within the mt genome are linked to high susceptibility to DM. A p...
DNA Polymorphisms at Candidate Gene Loci and Their Relation with Milk Production Traits in Murrah Buffalo (Bubalus bubalis)
DNA polymorphism within diacylglycerol transferase 2 (DGAT2) / monoacyl glycerol transferases 2 (MOGAT2), leptin and butyrophilin genes were analysed using PCR-SSCP in Murrah buffalo. The single strand conformation polymorphism (SSCP) analysis of amplified gene fragment in exon 5 of MOGAT2, exon 3 of leptin and intron 1 of butyrophilin gene revealed different patterns. A, B and C showed the fol...